The packages used in this tutorial can be installed with the following commands:
install.packages("tidyverse")
install.packages("raster")
install.packages("spData")
install.packages("devtools")
devtools::install_github("Nowosad/spDataLarge")Let’s start by loading the packages we will require for this section of the workshop:
library("tidyverse")
library("raster")
library("spData")
library("spDataLarge")The geographic raster data model usually consists of a raster header and a matrix (with rows and columns) representing equally spaced cells (often also called pixels; Figure panel A).
The raster header defines the coordinate reference system, the extent and the origin. The origin (or starting point) is frequently the coordinate of the lower-left corner of the matrix (the raster package, however, uses the upper left corner, by default (Figure panel B)). The header defines the extent via the number of columns, the number of rows and the cell size resolution. Hence, starting from the origin, we can easily access and modify each single cell by either using the ID of a cell (Figure @ref(fig:raster-intro-plot):B) or by explicitly specifying the rows and columns. This matrix representation avoids storing explicitly the coordinates for the four corner points (in fact it only stores one coordinate, namely the origin) of each cell corner as would be the case for rectangular vector polygons. This and map algebra makes raster processing much more efficient and faster than vector data processing. However, in contrast to vector data, the cell of one raster layer can only hold a single value. The value might be numeric or categorical (Figure panel C).
##
## Attaching package: 'gridExtra'
## The following object is masked from 'package:dplyr':
##
## combine
Raster data types: (A) cell IDs, (B) cell values, (C) a colored raster map.
Raster maps usually represent continuous phenomena such as elevation, temperature, population density or spectral data. Of course, we can represent discrete features such as soil or land-cover classes also with the help of a raster data model. Consequently, the discrete borders of these features become blurred, and depending on the spatial task a vector representation might be more suitable.
Examples of continuous and categorical rasters.
The raster package supports raster objects in R. It provides an extensive set of functions to create, read, export, manipulate and process raster datasets. Aside from general raster data manipulation, raster provides many low-level functions that can form the basis to develop more advanced raster functionality. raster also lets you work on large raster datasets that are too large to fit into the main memory. In this case, raster provides the possibility to divide the raster into smaller chunks (rows or blocks), and processes these iteratively instead of loading the whole raster file into RAM (for more information, please refer to vignette("functions", package = "raster").
For the illustration of raster concepts, we will use datasets from the spDataLarge (note these packages were loaded at the beginning of this chapter). It consists of a few raster objects and one vector object covering an area of the Zion National Park (Utah, USA). For example, srtm.tif is a digital elevation model of this area (for more details, see its documentation ?srtm). First, let’s create a RasterLayer object named new_raster:
raster_filepath = system.file("raster/srtm.tif", package = "spDataLarge")
new_raster = raster(raster_filepath)Typing the name of the raster into the console, will print out the raster header (extent, dimensions, resolution, CRS) and some additional information (class, data source name, summary of the raster values):
new_raster
#> class : RasterLayer
#> dimensions : 457, 465, 212505 (nrow, ncol, ncell)
#> resolution : 0.000833, 0.000833 (x, y)
#> extent : -113, -113, 37.1, 37.5 (xmin, xmax, ymin, ymax)
#> coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
#> data source : /home/robin/R/x86_64-pc-linux../3.5/spDataLarge/raster/srtm.tif
#> names : srtm
#> values : 1024, 2892 (min, max)Dedicated functions report each component: dim(new_raster) returns the number of rows, columns and layers; the ncell() function the number of cells (pixels); res() the raster’s spatial resolution; extent() its spatial extent; and crs() its coordinate reference system (raster reprojection is covered in Section @ref(reprojecting-raster-geometries)). inMemory() reports whether the raster data is stored in memory (the default) or on disk.
help("raster-package") returns a full list of all available raster functions.
Similar to the sf package, raster also provides plot() methods for its own classes.
plot(new_raster)Basic raster plot.
There are several other approaches for plotting raster data in R that are outside the scope of this section, including:
spplot() and levelplot() (from the sp and rasterVis packages, respectively) to create facets, a common technique for visualizing change over time.The RasterLayer class represents the simplest form of a raster object, and consists of only one layer. The easiest way to create a raster object in R is to read-in a raster file from disk or from a server.
raster_filepath = system.file("raster/srtm.tif", package = "spDataLarge")
new_raster = raster(raster_filepath)The raster package supports numerous drivers with the help of rgdal. To find out which drivers are available on your system, run raster::writeFormats() and rgdal::gdalDrivers().
Rasters can also be created from scratch using the raster() function. This is illustrated in the subsequent code chunk, which results in a new RasterLayer object. The resulting raster consists of 36 cells (6 columns and 6 rows specified by nrows and ncols) centered around the Prime Meridian and the Equator (see xmn, xmx, ymn and ymx parameters). The CRS is the default of raster objects: WGS84. This means the unit of the resolution is in degrees which we set to 0.5 (res). Values (vals) are assigned to each cell: 1 to cell 1, 2 to cell 2, and so on. Remember: raster() fills cells row-wise (unlike matrix()) starting at the upper left corner, meaning the top row contains the values 1 to 6, the second 7 to 12, etc.
new_raster2 = raster(nrows = 6, ncols = 6, res = 0.5,
xmn = -1.5, xmx = 1.5, ymn = -1.5, ymx = 1.5,
vals = 1:36)For other ways of creating raster objects, see ?raster.
Aside from RasterLayer, there are two additional classes: RasterBrick and RasterStack. Both can handle multiple layers, but differ regarding the number of supported file formats, type of internal representation and processing speed.
Here we will cover RasterStack objects because they may be particularly useful to a number of approaches we might use (such as species distribution modeling!). To find out more about RasterBrick objects, type ?raster::brick.
A RasterStack allows you to connect several raster objects stored in different files or multiple objects in memory. More specifically, a RasterStack is a list of RasterLayer objects with the same extent and resolution. Hence, one way to create it is with the help of spatial objects already existing in R’s global environment. And again, one can simply specify a path to a file stored on disk.
multi_raster_file = system.file("raster/landsat.tif", package = "spDataLarge")
r_multi = brick(multi_raster_file)
raster_on_disk = raster(r_multi, layer = 1)
raster_in_memory = raster(xmn = 301905, xmx = 335745,
ymn = 4111245, ymx = 4154085,
res = 30)
values(raster_in_memory) = sample(seq_len(ncell(raster_in_memory)))
crs(raster_in_memory) = crs(raster_on_disk)r_stack = stack(raster_in_memory, raster_on_disk)
r_stack
#> class : RasterStack
#> dimensions : 1428, 1128, 1610784, 2
#> resolution : 30, 30
#> ...
#> names : layer, landsat.1
#> min values : 1, 7550
#> max values : 1610784, 19071RasterStack allows calculations based on many files, many Raster* objects, or both.
In contrast to the vector data model underlying simple features (which represents points, lines and polygons as discrete entities in space), raster data represent continuous surfaces. This section shows how raster objects work by creating them from scratch. Because of their unique structure, subsetting and other operations on raster datasets work in a different way, as demonstrated in Section @ref(raster-subsetting).
The following code recreates the raster dataset used in Section @ref(raster-classes). This demonstrates how the raster() function works to create an example raster named elev (representing elevations).
elev = raster(nrows = 6, ncols = 6, res = 0.5,
xmn = -1.5, xmx = 1.5, ymn = -1.5, ymx = 1.5,
vals = 1:36)The result is a raster object with 6 rows and 6 columns (specified by the nrow and ncol arguments), and a minimum and maximum spatial extent in x and y direction (xmn, xmx, ymn, ymax). The vals argument sets the values that each cell contains: numeric data ranging from 1 to 36 in this case. Raster objects can also contain categorical values of class logical or factor variables in R. The following code creates a raster representing grain sizes:
grain_order = c("clay", "silt", "sand")
grain_char = sample(grain_order, 36, replace = TRUE)
grain_fact = factor(grain_char, levels = grain_order)
grain = raster(nrows = 6, ncols = 6, res = 0.5,
xmn = -1.5, xmx = 1.5, ymn = -1.5, ymx = 1.5,
vals = grain_fact)raster objects can contain values of class numeric, integer, logical or factor, but not character. To use character values, they must first be converted into an appropriate class, for example using the function factor(). The levels argument was used in the preceding code chunk to create an ordered factor: clay < silt < sand in terms of grain size. See the Data structures chapter of @wickham_advanced_2014 for further details on classes.
raster objects represent categorical variables as integers, so grain[1, 1] returns a number that represents a unique identifier, rather than “clay”, “silt” or “sand”. The raster object stores the corresponding look-up table or “Raster Attribute Table” (RAT) as a data frame in a new slot named attributes, which can be viewed with ratify(grain) (see ?ratify() for more information). Use the function levels() for retrieving and adding new factor levels to the attribute table:
levels(grain)[[1]] = cbind(levels(grain)[[1]], wetness = c("wet", "moist", "dry"))
levels(grain)## [[1]]
## ID VALUE wetness
## 1 1 clay wet
## 2 2 silt moist
## 3 3 sand dry
This behavior demonstrates that raster cells can only possess one value, an identifier which can be used to look up the attributes in the corresponding attribute table (stored in a slot named attributes). This is illustrated by the command below, which returns the grain size and wetness of cell IDs 1, 11 and 35:
factorValues(grain, grain[c(1, 11, 35)])## VALUE wetness
## 1 sand dry
## 2 silt moist
## 3 clay wet
## Warning: package 'tmap' was built under R version 3.5.3
## Warning: package 'sf' was built under R version 3.5.2
Raster datasets with numeric (left) and categorical values (right).
Raster subsetting is done with the base R operator [, which accepts a variety of inputs:
Here, we only show the first two options since these can be considered non-spatial operations. If we need a spatial object to subset another or the output is a spatial object, we refer to this as spatial subsetting. Therefore, the latter two options will be shown below (see Section ‘Spatial subsetting’).
The first two subsetting options are demonstrated in the commands below — both return the value of the top left pixel in the raster object elev (results not shown):
# row 1, column 1
elev[1, 1]
# cell ID 1
elev[1]To extract all values or complete rows, you can use values() and getValues(). For multi-layered raster objects stack or brick, this will return the cell value(s) for each layer. For example, stack(elev, grain)[1] returns a matrix with one row and two columns — one for each layer. For multi-layer raster objects another way to subset is with raster::subset(), which extracts layers from a raster stack or brick. The [[ and $ operators can also be used:
r_stack = stack(elev, grain)
names(r_stack) = c("elev", "grain")
# three ways to extract a layer of a stack
raster::subset(r_stack, "elev")
r_stack[["elev"]]
r_stack$elevCell values can be modified by overwriting existing values in conjunction with a subsetting operation. The following code chunk, for example, sets the upper left cell of elev to 0:
elev[1, 1] = 0
elev[]## [1] 0 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
## [24] 24 25 26 27 28 29 30 31 32 33 34 35 36
Leaving the square brackets empty is a shortcut version of values() for retrieving all values of a raster. Multiple cells can also be modified in this way:
elev[1, 1:2] = 0raster contains functions for extracting descriptive statistics for entire rasters. Printing a raster object to the console by typing its name returns minimum and maximum values of a raster. summary() provides common descriptive statistics (minimum, maximum, quartiles and number of NAs). Further summary operations such as the standard deviation (see below) or custom summary statistics can be calculated with cellStats().
cellStats(elev, sd)summary() and cellStats() functions with a raster stack or brick object, they will summarize each layer separately, as can be illustrated by running: summary(brick(elev, grain)).
Raster value statistics can be visualized in a variety of ways. Specific functions such as boxplot(), density(), hist() and pairs() work also with raster objects, as demonstrated in the histogram created with the command below (not shown):
hist(elev)In case a visualization function does not work with raster objects, one can extract the raster data to be plotted with the help of values() or getValues().
Descriptive raster statistics belong to the so-called global raster operations. These and other typical raster processing operations are part of the map algebra scheme, and are covered below (Section @ref(map-algebra)).
Some function names clash between packages (e.g., select(), as discussed in a previous note). In addition to not loading packages by referring to functions verbosely (e.g., dplyr::select()), another way to prevent function names clashes is by unloading the offending package with detach(). The following command, for example, unloads the raster package (this can also be done in the package tab which resides by default in the right-bottom pane in RStudio): detach(“package:raster”, unload = TRUE, force = TRUE). The force argument makes sure that the package will be detached even if other packages depend on it. This, however, may lead to a restricted usability of packages depending on the detached package, and is therefore not recommended.
This section builds on Section @ref(manipulating-raster-objects), which highlights various basic methods for manipulating raster datasets, to demonstrate more advanced and explicitly spatial raster operations, and uses the objects elev and grain manually created in Section @ref(manipulating-raster-objects). For the reader’s convenience, these datasets can be also found in the spData package.
Above, we demonstrated how to retrieve values associated with specific cell IDs or row and column combinations. Raster objects can also be extracted by location (coordinates) and other spatial objects. To use coordinates for subsetting, one can ‘translate’ the coordinates into a cell ID with the raster function cellFromXY(). An alternative is to use raster::extract() (be careful, there is also a function called extract() in the tidyverse) to extract values. Both methods are demonstrated below to find the value of the cell that covers a point located 0.1 units from the origin.
id = cellFromXY(elev, xy = c(0.1, 0.1))
elev[id]
# the same as
raster::extract(elev, data.frame(x = 0.1, y = 0.1))It is convenient that both functions also accept objects of class Spatial* Objects. Raster objects can also be subset with another raster object, as illustrated in the figure below (left panel) and demonstrated in the following code chunk:
clip = raster(xmn = 0.9, xmx = 1.8, ymn = -0.45, ymx = 0.45,
res = 0.3, vals = rep(1, 9))
elev[clip]## [1] 18 24
# we can also use extract
# extract(elev, extent(clip))Basically, this amounts to retrieving the values of the first raster (here: elev) falling within the extent of a second raster (here: clip).
Subsetting raster values with the help of another raster (left). Raster mask (middle). Output of masking a raster (right).
So far, the subsetting returned the values of specific cells, however, when doing spatial subsetting, one often also expects a spatial object as an output. To do this, we can use again the [ when we additionally set the drop parameter to FALSE. To illustrate this, we retrieve the first two cells of elev as an individual raster object. As mentioned in Section @ref(manipulating-raster-objects), the [ operator accepts various inputs to subset rasters and returns a raster object when drop = FALSE. The code chunk below subsets the elev raster by cell ID and row-column index with identical results: the first two cells on the top row (only the first 2 lines of the output is shown):
elev[1:2, drop = FALSE] # spatial subsetting with cell IDs
elev[1, 1:2, drop = FALSE] # spatial subsetting by row,column indices
#> class : RasterLayer
#> dimensions : 1, 2, 2 (nrow, ncol, ncell)
#> ...Another common use case of spatial subsetting is when a raster with logical (or NA) values is used to mask another raster with the same extent and resolution, as illustrated in the figure above, middle and right panel. In this case, the [, mask() and overlay() functions can be used (results not shown):
# create raster mask
rmask = elev
values(rmask) = sample(c(NA, TRUE), 36, replace = TRUE)
# spatial subsetting
elev[rmask, drop = FALSE] # with [ operator
mask(elev, rmask) # with mask()
overlay(elev, rmask, fun = "max") # with overlayIn the code chunk above, we have created a mask object called rmask with values randomly assigned to NA and TRUE. Next, we want to keep those values of elev which are TRUE in rmask. In other words, we want to mask elev with rmask. These operations are in fact Boolean local operations since we compare cell-wise two rasters. The next subsection explores these and related operations in more detail.
Map algebra makes raster processing really fast. This is because raster datasets only implicitly store coordinates. To derive the coordinate of a specific cell, we have to calculate it using its matrix position and the raster resolution and origin. For the processing, however, the geographic position of a cell is barely relevant as long as we make sure that the cell position is still the same after the processing (one-to-one locational correspondence). Additionally, if two or more raster datasets share the same extent, projection and resolution, one could treat them as matrices for the processing. This is exactly what map algebra is doing in R. First, the raster package checks the headers of the rasters on which to perform any algebraic operation, and only if they are correspondent to each other, the processing goes on. And secondly, map algebra retains the so-called one-to-one locational correspondence. This is where it substantially differs from matrix algebra which changes positions when for example multiplying or dividing matrices.
Map algebra (or cartographic modeling) divides raster operations into four subclasses [@tomlin_geographic_1990], with each working on one or several grids simultaneously:
This typology classifies map algebra operations by the number/shape of cells used for each pixel processing step. For the sake of completeness, we should mention that raster operations can also be classified by discipline such as terrain, hydrological analysis or image classification. The following sections explain how each type of map algebra operations can be used, with reference to worked examples (also see vignette("Raster") for a technical description of map algebra).
Local operations comprise all cell-by-cell operations in one or several layers. A good example is the classification of intervals of numeric values into groups such as grouping a digital elevation model into low (class 1), middle (class 2) and high elevations (class 3). Using the reclassify() command, we need first to construct a reclassification matrix, where the first column corresponds to the lower and the second column to the upper end of the class. The third column represents the new value for the specified ranges in column one and two. Here, we assign the raster values in the ranges 0–12, 12–24 and 24–36 are reclassified to take values 1, 2 and 3, respectively.
rcl = matrix(c(0, 12, 1, 12, 24, 2, 24, 36, 3), ncol = 3, byrow = TRUE)
recl = reclassify(elev, rcl = rcl)Raster algebra is another classical use case of local operations. Many generic functions that allow for simple and elegant raster algebra have been implemented for Raster objects, including the normal algebraic operators such as +, -, *, /, logical operators such as >, >=, <, ==, ! and functions like abs, round, ceiling, floor, trunc, sqrt, log, log10, exp, cos, sin, atan, tan, max, min, range, prod, sum, any, all. In these functions you can mix raster objects with numbers, as long as the first argument is a raster object. Raster algebra also allows logical operations such as finding all raster cells that are greater than a specific value (5 in our example below).
The raster package supports all these operations and more, as described in vignette("Raster") and demonstrated below (results not show):
elev + elev
elev^2
log(elev)
elev > 5A really useful feature of RasterStack objects is that you can generate cell-wise summaries across the entire stack. Let’s create some raster stacks to have a look at simple examples:
elev_stack = stack(elev, elev+1, elev+4, elev+10)Summary functions (min, max, mean, prod, sum, Median, cv, range, any, all) always return a RasterLayer object.
mean(elev_stack)## class : RasterLayer
## dimensions : 6, 6, 36 (nrow, ncol, ncell)
## resolution : 0.5, 0.5 (x, y)
## extent : -1.5, 1.5, -1.5, 1.5 (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
## data source : in memory
## names : layer
## values : 3.75, 39.75 (min, max)
min(elev_stack)## class : RasterLayer
## dimensions : 6, 6, 36 (nrow, ncol, ncell)
## resolution : 0.5, 0.5 (x, y)
## extent : -1.5, 1.5, -1.5, 1.5 (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
## data source : in memory
## names : layer
## values : 0, 36 (min, max)
range(elev_stack)## class : RasterBrick
## dimensions : 6, 6, 36, 2 (nrow, ncol, ncell, nlayers)
## resolution : 0.5, 0.5 (x, y)
## extent : -1.5, 1.5, -1.5, 1.5 (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
## data source : in memory
## names : range_min, range_max
## min values : 0, 10
## max values : 36, 46
Use cellStats if instead of a RasterLayer you want a single number summarizing the cell values of each layer.
cellStats(elev_stack, 'sum')## layer.1 layer.2 layer.3 layer.4
## 663 699 807 1023
Instead of arithmetic operators, one can also use the calc() and overlay() functions. These functions are more efficient, hence, they are preferable in the presence of large raster datasets. Additionally, they allow you to directly store an output file.
The calculation of the normalized difference vegetation index (NDVI) is a well-known local (pixel-by-pixel) raster operation. It returns a raster with values between -1 and 1; positive values indicate the presence of living plants (mostly > 0.2). NDVI is calculated from red and near-infrared (NIR) bands of remotely sensed imagery, typically from satellite systems such as Landsat or Sentinel. Vegetation absorbs light heavily in the visible light spectrum, and especially in the red channel, while reflecting NIR light, explaining the NVDI formula:
\[ \begin{split} NDVI&= \frac{\text{NIR} - \text{Red}}{\text{NIR} + \text{Red}}\\ \end{split} \]
While local functions operate on one cell, though possibly from multiple layers, focal operations take into account a central cell and its neighbors. The neighborhood (also named kernel, filter or moving window) under consideration is typically of size 3-by-3 cells (that is the central cell and its eight surrounding neighbors), but can take on any other (not necessarily rectangular) shape as defined by the user. A focal operation applies an aggregation function to all cells within the specified neighborhood, uses the corresponding output as the new value for the the central cell, and moves on to the next central cell. Other names for this operation are spatial filtering and convolution.
In R, we can use the focal() function to perform spatial filtering. We define the shape of the moving window with a matrix whose values correspond to weights (see w parameter in the code chunk below). Secondly, the fun parameter lets us specify the function we wish to apply to this neighborhood. Here, we choose the minimum, but any other summary function, including sum(), mean(), or var() can be used.
r_focal = focal(elev, w = matrix(1, nrow = 3, ncol = 3), fun = min)Input raster (left) and resulting output raster (right) due to a focal operation - finding the minimum value in 3-by-3 moving windows.
We can quickly check if the output meets our expectations. In our example, the minimum value has to be always the upper left corner of the moving window (remember we have created the input raster by row-wise incrementing the cell values by one starting at the upper left corner). In this example, the weighting matrix consists only of 1s, meaning each cell has the same weight on the output, but this can be changed.
Focal functions or filters play a dominant role in image processing. Low-pass or smoothing filters use the mean function to remove extremes. In the case of categorical data, we can replace the mean with the mode, which is the most common value. By contrast, high-pass filters accentuate features. The line detection Laplace and Sobel filters might serve as an example here. Check the focal() help page for how to use them in R.
Zonal operations are similar to focal operations. The difference is that zonal filters can take on any shape instead of a predefined rectangular window. Our grain size raster is a good example because the different grain sizes are spread in an irregular fashion throughout the raster.
To find the mean elevation for each grain size class, we can use the zonal() command. This kind of operation is also known as zonal statistics in the GIS world.
z = zonal(elev, grain, fun = "mean") %>%
as.data.frame()
z## zone mean
## 1 1 17.58333
## 2 2 18.50000
## 3 3 19.16667
This returns the statistics for each category, here the mean altitude for each grain size class, and can be added to the attribute table of the ratified raster.
Global operations are a special case of zonal operations with the entire raster dataset representing a single zone. The most common global operations are descriptive statistics for the entire raster dataset such as the minimum or maximum (see Section @ref(summarizing-raster-objects)). Aside from that, global operations are also useful for the computation of distances. In the first case, one can calculate the distance from each cell to a specific target cell.
r1 = raster(ncol=100, nrow=100, xmn=0, xmx=100, ymn=0, ymx=100)
r1[] = NA # Assign NoData values to all pixels
r1[c(850, 5650)] = 1 # Change the pixels #850 and #5650 to 1
r1.d = distance(r1) In Section @ref(spatial-raster-subsetting) we have shown how to extract values from a raster overlaid by other spatial objects. To retrieve a spatial output, we can use almost the same subsetting syntax. The only difference is that we have to make clear that we would like to keep the matrix structure by setting the drop-parameter to FALSE. This will return a raster object containing the cells whose midpoints overlap with clip.
data("elev", package = "spData")
clip = raster(xmn = 0.9, xmx = 1.8, ymn = -0.45, ymx = 0.45,
res = 0.3, vals = rep(1, 9))
elev[clip, drop = FALSE]## class : RasterLayer
## dimensions : 2, 1, 2 (nrow, ncol, ncell)
## resolution : 0.5, 0.5 (x, y)
## extent : 1, 1.5, -0.5, 0.5 (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
## data source : in memory
## names : layer
## values : 18, 24 (min, max)
For the same operation we can also use the intersect() and crop() command.
When merging or performing map algebra on rasters, their resolution, projection, origin and/or extent have to match. Otherwise, how should we add the values of one raster with a resolution of 0.2 decimal degrees to a second raster with a resolution of 1 decimal degree? The same problem arises when we would like to merge satellite imagery from different sensors with different projections and resolutions. We can deal with such mismatches by aligning the rasters.
In the simplest case, two images only differ with regard to their extent. Following code adds one row and two columns to each side of the raster while setting all new values to an elevation of 1000 meters.
data(elev, package = "spData")
elev_2 = extend(elev, c(1, 2), value = 1000)
plot(elev_2)Original raster extended by one row on each side (top, bottom) and two columns on each side (right, left).
Performing an algebraic operation on two objects with differing extents in R, the raster package returns the result for the intersection, and says so in a warning.
elev_3 = elev + elev_2## Warning in elev + elev_2: Raster objects have different extents. Result for
## their intersection is returned
However, we can also align the extent of two rasters with extend(). Instead of telling the function how many rows or columns should be added (as done before), we allow it to figure it out by using another raster object. Here, we extend the elev object to the extent of elev_2. The newly added rows and column receive the default value of the value parameter, i.e., NA.
elev_4 = extend(elev, elev_2)The origin of a raster is the cell corner closest to the coordinates (0, 0). The origin() function returns the coordinates of the origin. In the below example a cell corner exists with coordinates (0, 0), but that is not necessarily the case.
origin(elev_4)## [1] 0 0
If two rasters have different origins, their cells do not overlap completely which would make map algebra impossible. To change the origin , use origin().1 Looking at the figure below reveals the effect of changing the origin.
# change the origin
origin(elev_4) = c(0.25, 0.25)
plot(elev_4)
# and add the original raster
plot(elev, add = TRUE)Rasters with identical values but different origins.
Note that changing the resolution frequently (next section) also changes the origin.
Raster datasets can also differ with regard to their spatial resolution. To match resolutions, one can either decrease (aggregate()) or increase (disaggregate()) the resolution of one raster. As an example, we here change the spatial resolution of new_raster by a factor of 12. Additionally, the output cell value should correspond to the mean of the input cells (note that one could use other functions as well, such as median(), sum(), etc.):
new_raster_agg = aggregate(new_raster, fact = 12, fun = mean)Original raster (left). Aggregated raster (right).
By contrast, thedisaggregate() function increases the resolution. However, we have to specify a method on how to fill the new cells. The disaggregate() function provides two methods. The first (nearest neighbor, method = "") simply gives all output cells the value of the nearest input cell, and hence duplicates values which leads to a blocky output image.
new_raster_disagg = disaggregate(new_raster_agg, fact = 12, method = "bilinear")
identical(new_raster, new_raster_disagg)## [1] FALSE
Comparing the values of new_raster and new_raster_disagg tells us that they are not identical (you can also use compareRaster() or all.equal()). However, this was hardly to be expected, since disaggregating is a simple interpolation technique. It is important to keep in mind that disaggregating results in a finer resolution; the corresponding values, however, are only as accurate as their lower resolution source.
If the origins of two raster datasets are just marginally apart, it sometimes is sufficient to simply increase the tolerance argument of raster::rasterOptions().↩